#PART 1: Equity Analysis of Health Insurance Coverage in the Bay Area

#PART 2: Mapping healthcare access in the Bay Area by race using census data

Percent of people with healthcare coverage

Difference between the percent of white people and the percent of nonwhite people with healthcare coverage

#PART 3: Analyzing cardiovascular disease prevalence and its relation to environmental health risk exposure using CalEnviroScreen data

For CalEnviroScreen purposes, cardiovascular disease is defined as the “Spatially modeled, age-adjusted rate of emergency department (ED) visits for AMI per 10,000 (averaged over 2015-2017),” where AMI is acute myocardial infarction (heart attack). People without health insurance are probably less likely to visit the emergency room, even if they are in need, so this measure may understate cardiovascular disease where coverage is low. This is why it would also be useful to map the strongest predictor and to compare the Bay Area map of CVD with the map of the percent of people with health insurance.

Rates of cardiovascular disease in the Bay Area

Next, we will consider different environmental indicators as predictors of cardiovascular disease.

Pesticides

## 
## Call:
## lm(formula = log(ces4_bay_data$"Cardiovascular Disease") ~ Pesticides, 
##     data = ces4_bay_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.94420 -0.25750 -0.00674  0.24996  0.85141 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 2.303e+00  9.064e-03 254.038  < 2e-16 ***
## Pesticides  1.822e-04  5.619e-05   3.242  0.00121 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3585 on 1578 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.006615,   Adjusted R-squared:  0.005986 
## F-statistic: 10.51 on 1 and 1578 DF,  p-value: 0.001213

The median of the residuals is close to zero and their distribution is fairly symmetric. R-squared is quite low: variation in pesticides explains only 0.66% of the variation in log cardiovascular disease.

Takeaway: the pesticides coefficient is statistically significant (p = 0.0012), but pesticides are a poor predictor of cardiovascular disease.
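Because the outcome in this model is log-transformed, the slope is easier to read as a percent change in the cardiovascular disease rate. The following is a minimal sketch that uses only the coefficient and R-squared reported in the summary above. The 100-unit step is an arbitrary illustration, not a value taken from the analysis.

```r
# Values copied from the lm() summary for the pesticides model
slope <- 1.822e-04     # change in log(CVD rate) per one-unit increase in Pesticides
r_squared <- 0.006615  # Multiple R-squared

# In a log-linear model, exp(slope) - 1 is the proportional change in the
# outcome associated with a one-unit increase in the predictor
pct_per_unit <- 100 * (exp(slope) - 1)

# Illustrative only: implied percent change for a 100-unit increase
pct_per_100 <- 100 * (exp(100 * slope) - 1)

print(round(c(per_unit = pct_per_unit, per_100_units = pct_per_100), 3))
print(round(100 * r_squared, 2))  # percent of variance explained
```

Read this way, a one-unit increase in the pesticides indicator is associated with a change of roughly 0.018% in the rate, which fits the takeaway that the relationship is detectable but small.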

Drinking water quality

## 
## Call:
## lm(formula = ces4_bay_data$"Cardiovascular Disease" ~ ces4_bay_data$"Drinking Water", 
##     data = ces4_bay_data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -7.0841 -2.9093 -0.6266  2.1625 12.8523 
## 
## Coefficients:
##                                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    11.6601704  0.2442778  47.733  < 2e-16 ***
## ces4_bay_data$"Drinking Water" -0.0034656  0.0008067  -4.296 1.85e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 3.911 on 1576 degrees of freedom
##   (3 observations deleted due to missingness)
## Multiple R-squared:  0.01157,    Adjusted R-squared:  0.01095 
## F-statistic: 18.45 on 1 and 1576 DF,  p-value: 1.847e-05

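The summary alone suggests the residuals are somewhat symmetrically distributed around zero, but the plot shows that the distribution is skewed. R-squared is still quite low: variation in drinking water explains only 1.16% of the variation in cardiovascular disease. Unlike the pesticides and PM2.5 models, this model uses the untransformed rate rather than its log.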
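One way to check residual skew beyond the five-number summary is to compare a skewness statistic and diagnostic plots for the untransformed and log-transformed outcome. The sketch below runs on simulated data, not on `ces4_bay_data`. The data frame `sim`, its column names, and the simulation parameters are all hypothetical stand-ins chosen only to produce a right-skewed outcome.

```r
set.seed(1)

# Hypothetical stand-in for the real data: a right-skewed outcome
n <- 500
sim <- data.frame(drinking_water = runif(n, 0, 800))
sim$cvd <- exp(2.4 - 0.0003 * sim$drinking_water + rnorm(n, sd = 0.35))

fit_raw <- lm(cvd ~ drinking_water, data = sim)       # untransformed outcome
fit_log <- lm(log(cvd) ~ drinking_water, data = sim)  # log-transformed outcome

# Simple moment-based skewness: near 0 for symmetric residuals,
# positive for a long right tail
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew_raw <- skewness(residuals(fit_raw))
skew_log <- skewness(residuals(fit_log))
print(c(raw = skew_raw, log = skew_log))

# Visual checks on the untransformed model: histogram and normal Q-Q plot
hist(residuals(fit_raw), breaks = 30, main = "Residuals, untransformed model")
qqnorm(residuals(fit_raw))
qqline(residuals(fit_raw))
```

If the real residuals behave like the simulated ones, refitting the drinking water model with `log()` on the outcome would also make it directly comparable to the other two models.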

PM2.5

## 
## Call:
## lm(formula = log(ces4_bay_data$"Cardiovascular Disease") ~ ces4_bay_data$PM2.5, 
##     data = ces4_bay_data)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.94804 -0.25783 -0.00379  0.23135  0.84283 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          1.76764    0.12438  14.211  < 2e-16 ***
## ces4_bay_data$PM2.5  0.06343    0.01463   4.336 1.54e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.3576 on 1578 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.01177,    Adjusted R-squared:  0.01115 
## F-statistic:  18.8 on 1 and 1578 DF,  p-value: 1.545e-05

The residuals are centered at zero and relatively symmetric. R-squared is low: variation in PM2.5 explains only 1.18% of the variation in log cardiovascular disease.

From this, we see that PM2.5 has the highest R-squared of the three indicators (0.0118), but it is still a very weak predictor.
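To make the comparison explicit, the three fits can be placed side by side. This sketch only tabulates the R-squared values printed in the summaries above. The data frame `fits` and its column names are introduced here for illustration.

```r
# R-squared values copied from the three lm() summaries above
fits <- data.frame(
  predictor = c("Pesticides", "Drinking Water", "PM2.5"),
  outcome   = c("log(CVD)", "CVD", "log(CVD)"),
  r_squared = c(0.006615, 0.01157, 0.01177)
)

# Percent of variance explained, rounded to two decimals
fits$pct_explained <- round(100 * fits$r_squared, 2)

# Rank from strongest to weakest
print(fits[order(-fits$r_squared), ])

best <- fits$predictor[which.max(fits$r_squared)]
print(best)
```

The `outcome` column is a reminder that the drinking water model was fit on the untransformed rate, so its R-squared is not strictly comparable to the other two.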

#NOT SURE THAT I WILL INCLUDE THIS PART (Not sure how to tie it in to the rest of the project, but I have the code so wanted to include it for now) PART 4: Mapping private health insurance coverage by white immigrant and non-white immigrant populations in the Bay Area

Out of interest, I used PUMS data to look at the intersecting identities of race and immigration status, and how this may relate to health insurance in the Bay Area.

Percent of white and non-white households with health insurance coverage in Bay Area PUMAs

Difference between the percent of white and non-white households with health insurance coverage in Bay Area PUMAs

Averaging across PUMAs, we see that the discrepancy between the percent of white and non-white households with private health insurance is greatest in Sonoma County.
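The county-level comparison can be sketched as a grouped mean of PUMA-level gaps. The example below uses a made-up table. The county labels, percentages, and the names `puma_gaps` and `county_gaps` are hypothetical and do not come from the PUMS results.

```r
# Hypothetical PUMA-level table; counties and percentages are made up
puma_gaps <- data.frame(
  county       = c("County A", "County A", "County B", "County B", "County C"),
  pct_white    = c(80, 76, 85, 83, 78),
  pct_nonwhite = c(70, 72, 65, 69, 74)
)

# Gap in percentage points between white and non-white households
puma_gaps$gap <- puma_gaps$pct_white - puma_gaps$pct_nonwhite

# Unweighted mean gap across the PUMAs in each county
county_gaps <- aggregate(gap ~ county, data = puma_gaps, FUN = mean)
county_gaps <- county_gaps[order(-county_gaps$gap), ]
print(county_gaps)

largest_gap_county <- county_gaps$county[1]
print(largest_gap_county)
```

An unweighted mean treats every PUMA equally. If PUMA populations differ a lot within a county, a household-weighted mean may be a better summary, though that is a design choice this sketch does not settle.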